
    A computer model of auditory efferent suppression: Implications for the recognition of speech in noise

    The neural mechanisms underlying the ability of human listeners to recognize speech in the presence of background noise are still imperfectly understood. However, there is mounting evidence that the medial olivocochlear system plays an important role, via efferents that exert a suppressive effect on the response of the basilar membrane. The current paper presents a computer modeling study that investigates the possible role of this activity in speech intelligibility in noise. A model of auditory efferent processing [Ferry, R. T., and Meddis, R. (2007). J. Acoust. Soc. Am. 122, 3519–3526] is used to provide acoustic features for a statistical automatic speech recognition system, thus allowing the effects of efferent activity on speech intelligibility to be quantified. Performance of the "basic" model (without efferent activity) on a connected digit recognition task is good when the speech is uncorrupted by noise but falls when noise is present. However, recognition performance is much improved when efferent activity is applied. Furthermore, optimal performance is obtained when the amount of efferent activity is proportional to the noise level. The results obtained are consistent with the suggestion that efferent suppression causes a "release from adaptation" in the auditory-nerve response to noisy speech, which enhances its intelligibility.
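
    The front-end idea above can be sketched in a few lines of Python. The sketch is illustrative only: efferent_attenuation_db, its constant k, and the filterbank callable are hypothetical stand-ins, and for simplicity the attenuation is applied to the filterbank input rather than to the nonlinear filter path of the Ferry–Meddis model.

        import numpy as np

        def efferent_attenuation_db(noise_level_db, k=0.5):
            # Hypothetical rule: suppression grows in proportion to the
            # estimated background noise level, mirroring the finding that
            # recognition was best when efferent activity tracked the noise.
            return k * noise_level_db

        def auditory_features(signal, noise_level_db, filterbank):
            # Attenuate the input to a cochlear filterbank as a crude
            # stand-in for medial olivocochlear suppression of the
            # basilar-membrane response.
            gain = 10.0 ** (-efferent_attenuation_db(noise_level_db) / 20.0)
            channels = filterbank(gain * signal)   # (n_channels, n_samples)
            rates = np.maximum(channels, 0.0)      # half-wave rectification
            return rates.mean(axis=1)              # coarse per-channel feature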

    A frequency-selective feedback model of auditory efferent suppression and its implications for the recognition of speech in noise

    The potential contribution of the peripheral auditory efferent system to the ability to understand speech in a background of competing noise was studied using a computer model of the auditory periphery and assessed using an automatic speech recognition system. A previous study had shown that a fixed efferent attenuation applied to all channels of a multi-channel model could improve the recognition of connected digit triplets in noise [G. J. Brown, R. T. Ferry, and R. Meddis, J. Acoust. Soc. Am. 127, 943–954 (2010)]. In the current study, an anatomically justified feedback loop was used to automatically regulate separate attenuation values for each auditory channel. This arrangement resulted in a further enhancement of speech recognition over fixed-attenuation conditions. Comparisons between multi-talker babble and pink noise interference conditions suggest that the benefit originates from the model's ability to modify the amount of suppression in each channel separately according to the spectral shape of the interfering sounds.
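
    A minimal sketch of the channel-specific feedback idea, assuming a frame-based simulation; the target level, smoothing constant, and attenuation ceiling below are illustrative values, not the paper's fitted parameters.

        import numpy as np

        def feedback_attenuation(channel_env_db, target_db=55.0,
                                 alpha=0.99, max_atten_db=30.0):
            # channel_env_db: (n_channels, n_frames) output levels in dB.
            # Each channel smooths its own recent output level and raises
            # its attenuation when that level exceeds a target, so channels
            # dominated by steady interference are suppressed more than
            # channels carrying the target speech.
            n_ch, n_fr = channel_env_db.shape
            atten = np.zeros((n_ch, n_fr))
            smoothed = np.zeros(n_ch)
            for t in range(n_fr):
                smoothed = alpha * smoothed + (1 - alpha) * channel_env_db[:, t]
                atten[:, t] = np.clip(smoothed - target_db, 0.0, max_atten_db)
            return atten  # per-channel, per-frame suppression in dB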

    Adjustment of interaural-time-difference analysis to sound level

    To localize low-frequency sound sources in azimuth, the binaural system compares the timing of sound waves at the two ears with microsecond precision. A similarly high precision is also seen in the binaural processing of the envelopes of high-frequency complex sounds. Both for low- and high-frequency sounds, interaural time difference (ITD) acuity is to a large extent independent of sound level. The mechanisms underlying this level-invariant extraction of ITDs by the binaural system are, however, only poorly understood. We use high-frequency pip trains with asymmetric and dichotic pip envelopes in a combined psychophysical, electrophysiological, and modeling approach. Although the dichotic envelopes cannot be physically matched in terms of ITD, the match produced perceptually by humans is very reliable, and it depends systematically on the overall sound level. These data are reflected in neural responses from the gerbil lateral superior olive and lateral lemniscus. The results are predicted by an existing temporal-integration model extended with a level-dependent threshold criterion. These data provide a very sensitive quantification of how the peripheral temporal code is conditioned for binaural analysis.
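
    One way to picture the extended model is as leaky integration of the pip envelope followed by a threshold crossing whose criterion depends on overall level. The sketch below is a guess at that structure for illustration; the time constant and the linear criterion rule are assumptions, not the published values.

        import numpy as np

        def crossing_time(envelope, fs, level_db, tau=0.5e-3,
                          theta0=0.01, slope=0.0005):
            # Leaky temporal integration of one pip envelope.
            dt = 1.0 / fs
            a = np.exp(-dt / tau)
            integ = np.zeros_like(envelope)
            for i in range(1, len(envelope)):
                integ[i] = a * integ[i - 1] + (1 - a) * envelope[i]
            # Level-dependent threshold criterion: raising the criterion
            # with level shifts the extracted timing point along the
            # envelope, and hence the predicted envelope-ITD match.
            theta = theta0 + slope * level_db
            idx = np.argmax(integ >= theta)   # 0 if threshold never crossed
            return idx / fs                   # crossing time in seconds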

    The psychophysics of absolute threshold and signal duration: A probabilistic approach

    The absolute threshold for a tone depends on its duration; longer tones have lower thresholds. This effect has traditionally been explained in terms of "temporal integration" involving the summation of energy or perceptual information over time. An alternative probabilistic explanation of the process is formulated in terms of simple equations that predict not only the threshold–duration dependence but also the shape of the psychometric function at absolute threshold. It also predicts a tight relationship between these two functions. Measurements made using listeners with either normal or impaired hearing show that the probabilistic equations adequately fit observed threshold–duration functions and psychometric functions. The mathematical formulation implies that absolute threshold can be construed as a function of two parameters: (a) gain and (b) sensory threshold, and both parameters can be estimated from threshold–duration data. Sensorineural hearing impairment is sometimes associated with a smaller threshold–duration effect and sometimes with steeper psychometric functions. The equations explain why these two effects are expected to be linked. The probabilistic approach has the potential to discriminate between hearing deficits involving gain reduction and those resulting from a raised sensory threshold.
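
    The flavour of such a probabilistic account can be conveyed with a generic formulation (not necessarily the paper's exact equations). Suppose each time slice \Delta t of the tone is independently detected with probability p, where p grows with sensation level once the input exceeds the sensory threshold. Then

        P(d) = 1 - (1 - p)^{d/\Delta t},
        \qquad
        d_{\mathrm{thr}} = \Delta t \, \frac{\ln(1 - P_{\mathrm{thr}})}{\ln(1 - p)}

    The first expression is a psychometric function of level (through p) at a fixed duration d; the second, obtained by solving for the duration at which P reaches a criterion P_thr, is the threshold–duration function. Because the single parameter p underlies both, their shapes are necessarily linked, which illustrates the tight relationship referred to above.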

    Tinnitus and Patterns of Hearing Loss

    Tinnitus is strongly linked with the presence of damaged hearing. However, it is not known why tinnitus afflicts only some, and not all, hearing-impaired listeners. One possibility is that tinnitus patients have specific inner ear damage that triggers tinnitus. In this study, differences in cochlear function, inferred from psychophysical measures, were compared between hearing-impaired listeners with tinnitus and hearing-impaired listeners without tinnitus. Despite having similar average hearing loss, tinnitus patients were observed to have better frequency selectivity and compression than those without tinnitus. The results suggest that the presence of subjective tinnitus may not be strongly associated with outer hair cell impairment, at least where hearing impairment is evident. The results also show a different average pattern of hearing impairment amongst the tinnitus patients, consistent with the suggestion that inner hair cell dysfunction with subsequent reduce …

    Optimizing Speech Recognition Using a Computational Model of Human Hearing: Effect of Noise Type and Efferent Time Constants

    Physiological and psychophysical methods allow for an extended investigation of ascending (afferent) neural pathways from the ear to the brain in mammals, and their role in enhancing signals in noise. However, there is increased interest in descending (efferent) neural fibers in the mammalian auditory pathway. This efferent pathway operates via the olivocochlear system, modifying auditory processing by cochlear innervation and enhancing the human ability to detect sounds in noisy backgrounds. Effective speech intelligibility may depend on a complex interaction between efferent time constants and types of background noise. In this study, an auditory model with efferent-inspired processing provided the front-end to an automatic speech recognition (ASR) system, used as a tool to evaluate speech recognition with changes in time constants (50 to 2000 ms) and background noise type (unmodulated and modulated noise). With efferent activation, maximal speech recognition improvement (for both noise types) occurred for signal-to-noise ratios around 10 dB, characteristic of real-world speech-listening situations. Net speech improvement due to efferent activation (NSIEA) was smaller in modulated noise than in unmodulated noise. For unmodulated noise, NSIEA increased with increasing time constant. For modulated noise, NSIEA increased for time constants up to 200 ms but remained similar for longer time constants, consistent with speech-envelope modulation times important to speech recognition in modulated noise. The model improves our understanding of the complex interactions involved in speech recognition in noise, and could be used to simulate the difficulties of speech perception in noise as a consequence of different types of hearing loss.
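
    The role of the time constant can be pictured as one-pole smoothing of a within-channel level estimate that drives the suppression; the function below is an illustrative sketch, not the study's implementation, and the comments reflect the reported trend rather than its exact mechanism.

        import numpy as np

        def efferent_drive(env_db, fs, tau_s):
            # env_db: within-channel level estimate (dB) per sample;
            # tau_s: efferent time constant (50-2000 ms in the study).
            # Long time constants settle on the steady noise floor, while
            # short ones also follow the dips of modulated noise, relaxing
            # suppression during the glimpses that carry speech.
            a = np.exp(-1.0 / (fs * tau_s))
            drive = np.empty_like(env_db)
            drive[0] = env_db[0]
            for t in range(1, len(env_db)):
                drive[t] = a * drive[t - 1] + (1 - a) * env_db[t]
            return drive  # slowly varying control signal for attenuation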

    The representation of speech in a nonlinear auditory model: time-domain analysis of simulated auditory-nerve firing patterns

    A nonlinear auditory model is appraised in terms of its ability to encode speech formant frequencies in the fine time structure of its output. It is demonstrated that groups of model auditory nerve (AN) fibres with similar interpeak intervals accurately encode the resonances of synthetic three-formant syllables, in close agreement with physiological data. Acoustic features are derived from the interpeak intervals and used as the input to a hidden Markov model-based automatic speech recognition system. In a digits-in-noise recognition task, interval-based features gave a better performance than features based on AN firing rate at every signal-to-noise ratio tested.
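
    A sketch of interval-based feature extraction, assuming simulated spike times are available per fibre and that first-order inter-spike intervals suffice for illustration (the paper pools interpeak intervals across groups of fibres).

        import numpy as np

        def interval_features(spikes_per_fiber, max_interval=0.02, n_bins=64):
            # Pool inter-spike intervals across a group of simulated AN
            # fibres into one histogram. A formant at frequency F appears
            # as a peak near 1/F because fibres phase-lock to the dominant
            # resonance within their channel.
            edges = np.linspace(0.0, max_interval, n_bins + 1)
            hist = np.zeros(n_bins)
            for spikes in spikes_per_fiber:
                hist += np.histogram(np.diff(np.sort(spikes)), bins=edges)[0]
            return hist / max(hist.sum(), 1.0)  # normalized interval histogram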

    The robustness of speech representations obtained from simulated auditory nerve fibers under different noise conditions

    Different methods of extracting speech features from an auditory model were systematically investigated in terms of their robustness to different noises. The methods computed either the average firing rate within frequency channels (spectral features) or inter-spike intervals (timing features) from the simulated auditory nerve response. When used as the front-end for an automatic speech recognizer, timing features outperformed spectral features in Gaussian noise. However, this advantage was lost in babble, because timing features extracted the spectro-temporal structure of babble noise, which is similar to the target speaker. This suggests that different feature extraction methods are optimal depending on the background noise.
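
    The two feature families can be contrasted in a single sketch; frame boundaries in samples, spike times in seconds, and all parameter values here are assumptions for illustration rather than the study's settings.

        import numpy as np

        def spectral_frame(rates, lo, hi):
            # "Spectral" features: mean firing rate of each channel within
            # one analysis frame. In babble, the interferer adds speech-like
            # rate patterns of its own to every channel.
            return rates[:, lo:hi].mean(axis=1)

        def timing_frame(spikes_per_channel, lo, hi, fs, n_bins=32):
            # "Timing" features: per-channel inter-spike-interval histograms
            # within the frame. These capture periodicity from whichever
            # source dominates, which is why their advantage in Gaussian
            # noise erodes in babble.
            feats = []
            for spikes in spikes_per_channel:
                seg = spikes[(spikes >= lo / fs) & (spikes < hi / fs)]
                h, _ = np.histogram(np.diff(seg), bins=n_bins,
                                    range=(0.0, 0.02))
                feats.append(h)
            return np.concatenate(feats)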